Abstract

We develop a dynamic multi-agent model of an interbank payment 
system where banks choose their level of available funds on the basis of 
private payoff maximisation. The model consists of the repetition of a 
simultaneous move stage game with incomplete information, incomplete 
monitoring, and stochastic payoffs. Adaptation takes place through Bayesian
updating, with banks maximizing immediate payoffs. We carry out numerical
simulations to solve the model and investigate two special scenarios:
an operational incident and exogenous throughput guidelines for
payment submission. We find that the demand for intraday credit is an 
S-shaped function of the cost ratio between intraday credit costs and the 
costs associated with delaying payments. We also find that the demand for 
liquidity is increased both under operational incidents and in the presence 
of effective throughput guidelines. 
1 Introduction



Virtually all economic activity is facilitated by transfers of claims towards public 
or private financial institutions. The settlement of claims between banks takes
place, to a large extent, at the central bank, in central bank money. These interbank
payment systems transfer vast amounts of funds, and their smooth operation is
critical for the functioning of the whole financial system. In 2004, the annual
value of interbank payments made in the European TARGET was around $552
trillion, in the US Fedwire system $470 trillion, and in the UK CHAPS $59 
trillion - tens of times the value of their respective gross domestic products (BIS 
2006). These transfers originate from customer requests, and from the banks' 
proprietary operations in e.g. foreign exchange, securities and the interbank 
money market. The sheer size of these transfers, and their centrality for the 
functioning of a number of markets, make the mechanisms that regulate these 
fluxes and the incentives that generate them interesting to policy makers and 
regulators. 

At present most payment systems work on a real-time gross settlement 
(RTGS) or equivalent modality. In RTGS payments are settled continuously 
and individually throughout the day with immediate finality. To cover the pay- 
ments banks generally use their reserve balances, access intraday credit from 
the central bank or use incoming funds from payments from other banks. The 
first two sources carry an (opportunity) cost which gives banks incentives to 
economize on their use. We call these funds liquidity. The third source, on the
other hand, depends on the liquidity decisions of other banks. The less
liquidity a bank commits for settlement, the more dependent it is on incoming
payments - and it may thus need to delay its own payments until these funds
arrive, causing the receivers of its payments to receive funds later. If delays
are also costly, each bank faces a trade-off between liquidity costs and delay costs.
Both depend on the bank's own liquidity decision, but the latter
also depends on the liquidity decisions of other banks.

This paper develops a dynamic model to study this trade-off. The model
consists of a sequence of independent settlement days where a set of homogeneous
banks make payments to each other. Each of these days is a simultaneous-move
game (or a stage game) in which banks choose their level of liquidity for payment 
processing. At the end of the day they receive a stochastic payoff determined 
by the amount of liquidity they committed and delays they experienced. Due 
to the nature of the settlement process, the payoff function is a random variable 
unknown to the banks. In this context, a reasonable assumption is that banks 
use heuristic, boundedly rational rules to adapt their behaviour over time.
Hence, we simulate a learning process taking place over many days, until banks 
settle down in equilibria. We are interested in the properties of the equilib- 
ria in aggregate terms, i.e. in the behaviour of the system as the product of 
independent, single agents' private payoff maximization. 

Given its game-theoretic approach, this paper is related to recent work by 
Angelini (1998), Bech and Garratt (2003, 2006), Buckle and Campbell (2003) 
and Willison (2004). These study various "liquidity management games" with 
few (typically, two) agents and few (typically, three) periods. There, however,
the payoff function is common knowledge. Due to the complex mechanics taking 
place in real payment systems this is likely to be unrealistic. Recent work 
by Beyeler et al. (2007) on the relationship between instruction arrival and 
payment settlement in a similar setting shows that with low liquidity, payment 
settlement gets coupled across the network and is governed by the dynamics 
of the queue - and largely unpredictable when a large number of payments 
are made. The present paper makes an effort to model this complexity; in a 
similar spirit, it also considers a large number of banks, which settle payments 
in a continuous-time day, and which interact over a long sequence of settlement 
days. 

Recently, a growing literature has used simulation techniques to investigate 
the effects of, e.g., failures in complex payment systems (see e.g. BoE (2004),
Leinonen (2005), Devriese and Mitchell (2005)). These studies generally use 
historical payment data and simulate banks' risk exposures under alternative 
scenarios, or ways to improve liquidity efficiency of the systems. The shortcom- 
ing of this approach has been that the behaviour of banks is not endogenously 
determined. It is either assumed to remain unchanged or to change in a prede- 
termined manner. 

The present paper tries to overcome some of the shortcomings of both the "game
theoretic" and "simulation" approaches by modelling banks as learning agents.
Agents who learn about each other's actions through repeated interaction are a
recurring theme in evolutionary game theory. In one strand of the literature^1 the
agents know their payoff function, and learn about others' behaviour. They do
so by playing the stage game repeatedly, while choosing their actions on the basis
of adaptive rules of the type "choose a best reply to the current strategy profile"
or "choose a best reply to the next expected strategy profile". Results obtained
in this strand cannot be immediately applied here: banks cannot choose best
replies as they do not know their payoff function. A second research line does
not require knowledge of the payoff function on the part of the learners; its rules
are instead of the kind "adopt more frequently an action that has produced
a high payoff in the past". The main results of this literature are about the
convergence (or non-convergence) of actions to equilibria of the stage game. 

The approach adopted here is close to the latter. However, because the pay- 
offs are calculated on the basis of a settlement algorithm, we cannot analytically 
calculate the equilibria ex-ante, and then demonstrate convergence (or the lack 
of it). Instead, we show convergence by means of simulations, inferring then 
that the attraction points are equilibria of the stage game - in a sense that 
we make precise. Because the payoff function is stochastic and unknown, the 
problem of each optimizing bank lends itself to a heuristic approach. From this 
perspective, our work bears strong links to the reinforcement learning literature.^2
From an individual agent's perspective it relates to operations
research, where a typical problem is that of maximizing an unknown function.

^1 E.g. fictitious play, following Brown (1951).

^2 See Sutton and Barto (1998) for an overview. For Q-Learning, a common reinforcement
learning technique, see Watkins and Dayan (1992).
However, in our setting the environment is not static: through time, actions
yield different payoffs both because the payoff function is random, and because 
the other agents change their behaviour. 

The model is rich enough to investigate a number of policy issues; here, we 
focus on the aggregate liquidity of the system. As a first result we derive a 
liquidity demand function, relating total funds to the ratio of delay to liquidity 
costs. This function is found to be increasing in the relative cost of delay,
and S-shaped. Then, we look at the effect of operational incidents affecting
random participants of the system. We find that banks would generally prefer
to commit more liquidity if the disruption were known - except in the
extreme cases of very low and very high delay costs. Throughput guidelines
for payment submission are a common tool used by system designers to reduce risk
in payment systems;^3 we look at the effect of one such rule on liquidity usage
and find that at sufficiently low delay costs banks would increase their liquidity 
holdings to contain delays. Finally, we explore some efficiency issues, namely 
whether smaller systems are more or less liquidity efficient than large ones. We 
find that a system with a smaller number of banks uses less liquidity for a given 
level of payment activity. 

The paper is organised as follows: Section 2 develops a formal description of 
the model and the agents' learning process, and describes the payoff function. 
Section 3 presents the results of the experiments and section 4 concludes. 

2 Description of the model 

2.1 Stage game and its repetition 

The model consists of $N$ agents indexed by $i = 1, \ldots, N$, who repeatedly play a
stage game $\Gamma = \langle A, \pi_1, \pi_2, \ldots, \pi_N \rangle$. Here $A = \{0, 1, \ldots, K\}$ is the (common) finite
action set for each agent, and $\pi_i$, defined on $A^N$, is $i$'s outcome function, which
maps the set of action profiles into a set of payoff distribution functions. That
is, given the action profile $a \in A^N$, agent $i$ receives a stage-game payoff drawn
from a univariate distribution $\pi_i(a)$, whose shape depends on $N$ parameters -
the stage game action profile. To keep the exposition uncluttered, we leave the
precise form of the outcome function $\pi_i(\cdot)$ undefined at this stage. Details are
given in Section 2.3, where we also give a precise economic interpretation to the
abstract entities introduced here. Information in the game is incomplete as the
outcome function $\pi(\cdot)$ is unknown to the agents. Agents are risk-neutral, so
they care about the expected payoff. Hence, bank $i$ will only be concerned with
its payoff function $f_i(a) = E(\pi_i(a))$.

The stage game $\Gamma$ is repeated through discrete time, running from $t = 0$ to
(potentially) infinity. The action profile chosen in stage game $t$ is denoted by
$a(t) = (a_1(t), a_2(t), \ldots, a_N(t))$. A particular realization of the payoff vector

^3 A throughput guideline is a constraint imposed on banks' behaviour by the system
regulator; typically, it demands that certain percentages of the total daily payments be executed
by given deadlines within the day.
drawn from $\pi(a(t))$ is indicated by $y(t) \in \mathbb{R}^N$, which is therefore also called
the "game-$t$ payoff".

Monitoring is incomplete. At the beginning of (stage) game $t$, each agent $i$
knows the following: all its own previous choices and realized payoffs, and some
statistic of others' past choices, $\bar{a}_{-i}(k)$. An (observed) history (by $i$ at time
$t$) is thus denoted by $h_i^t = \{a_i(k), y_i(k), \bar{a}_{-i}(k)\}_{k=0,\ldots,t-1}$. Let $H_i^t$ be the
set of all possible histories that $i$ may observe up to $t$, and let us define
$H_i = \bigcup_t H_i^t$. Differently from the literature on repeated games, but more in line with
that on evolutionary game theory, we assume that agents aim at maximizing
immediate payoffs (instead of e.g. the discounted stream of payoffs). That
is, histories are essential to learn about payoffs and about others' actions, but
agents disregard strategic spillover effects between stage games. This seems a
sensible assumption here: the complexity of the environment makes it unlikely
that agents anticipate all interactions.^4

2.2 Information, learning and strategies 

Agent i faces two forms of uncertainty: uncertainty about the payoff function 
given others' actions, and uncertainty about others' actions. The first element
gives our model a flavour of decision theory; the second is a game-theoretic
issue.

2.2.1 Information 

As time goes by and histories are updated, agents can be seen to accumulate
information. More formally, we posit that of the whole history observed up to $t$
each $i$ retains some multi-dimensional statistics, say $p_i(t)$. These are the beliefs
about the state of the environment that $i$ is learning, and they constitute the basis
for the definition of strategies. Here, $p_i(t)$ is composed of two parts:

a) an "estimate" $f_i^t(\cdot)$ of the payoff function $f_i(a) = E(\pi_i(a))$;^5

b) an "estimate" $p_i^t(\cdot)$ of probabilities for other agents' actions in the next
stage game.

Of a whole action profile $a$, $i$ only observes its own action and a statistic
$\bar{a}_{-i}$ which correlates with $a_{-i}$. Hence, we assume that the estimate $f_i^t(\cdot)$ assigns
to each $(a_i, \bar{a}_{-i})$ an expected payoff. As for the estimate b), each $i$ is assumed
to maintain static expectations about others' actions. That is, $i$ believes that
$a_{-i}$ is drawn from a time-invariant distribution, as if other agents were adopting
a constant mixed strategy. We adopt this assumption, a classic in evolutionary
game theory, because it is simple and because it yielded the same results as some

^4 In the realm of reinforcement learning, immediate payoff maximisation where actions are
associated with situations is referred to, after Barto, Sutton and Brouwer (1981), as associative
search.

^5 Risk-neutral players are only interested in the expected value of payoffs, so there is no
gain in assuming that $f$ instead maps actions into payoff probability distributions.
more sophisticated forms of beliefs.^6 Because action profiles that generate the
same statistic $\bar{a}_{-i}$ are indistinguishable to $i$, the estimate $p_i^t(\cdot)$ also refers to $\bar{a}_{-i}$
instead of $a_{-i}$. In the simulation, we posit that $\bar{a}_{-i}$ is the average action of the
"other" agents, which clearly takes values in $[0, K]$. So, approximating to the
nearest integer, $p_i^t(\bar{a}_{-i})$ is a vector with $K$ entries, collecting the probabilities
of any of the (other agents') average actions being played.



2.2.2 Learning 

The information stored in $f_i^t(\cdot)$ and $p_i^t(\cdot)$ is updated as time goes by, according
to learning rules. A learning rule for agent $i$, denoted by $\Lambda_i$, assigns to each
observed history an updated $p_i(t)$.

Define $I_k(a_i, \bar{a}_{-i})$ as the indicator function equal to 1 if action profile $(a_i, \bar{a}_{-i})$
appears at time $k$ and zero otherwise. We use the following learning rule:

$$f_i^t(a_i, \bar{a}_{-i}) = \frac{\sum_{k=0}^{t-1} y_i(k)\, I_k(a_i, \bar{a}_{-i})}{\sum_{k=0}^{t-1} I_k(a_i, \bar{a}_{-i})} \qquad (1)$$

[Equation (2): update of $p_i^t(\bar{a}_{-i})$ from observed frequencies; not recoverable from the source]

In words, $f_i^t(a_i, \bar{a}_{-i})$ is the average payoff obtained under action profile
$(a_i, \bar{a}_{-i})$ up until time $t$ excluded. Similarly, the components of the vector
$p_i^t(\bar{a}_{-i})$ are calculated according to the observed frequencies, starting from an
initial estimate. This is known as the "fictitious play" updating rule starting
from a uniform estimate; it corresponds to Bayesian updating of beliefs
about a constant, unknown distribution over the other agents' actions.^7
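As an illustration only, the two estimates can be maintained with simple running counts. The Python sketch below is not the authors' code: the class and method names are hypothetical, and the prior used for the frequency estimate (a single pseudo-observation spread uniformly over the possible average actions, so that the initial estimate is uniform) is an assumption made here because the exact form of the update is not legible in the source.

```python
from collections import defaultdict


class BeliefTracker:
    """Running estimates kept by one agent (hypothetical names)."""

    def __init__(self, n_levels):
        # n_levels: number of possible (rounded) average actions of the others
        self.n_levels = n_levels
        self.payoff_sum = defaultdict(float)  # sum of payoffs per (a_i, abar)
        self.visits = defaultdict(int)        # times (a_i, abar) was observed
        self.abar_counts = [0] * n_levels     # times each abar was observed
        self.t = 0                            # number of stage games seen

    def update(self, own_action, abar, payoff):
        """Record one stage game: own action, others' average, realized payoff."""
        key = (own_action, abar)
        self.payoff_sum[key] += payoff
        self.visits[key] += 1
        self.abar_counts[abar] += 1
        self.t += 1

    def f_hat(self, own_action, abar):
        """Average payoff observed under (a_i, abar), as in Eq. (1); None if unseen."""
        key = (own_action, abar)
        n = self.visits.get(key, 0)
        return self.payoff_sum[key] / n if n else None

    def p_hat(self, abar):
        """Observed frequency of abar, blended with a uniform prior (assumed form)."""
        return (1.0 / self.n_levels + self.abar_counts[abar]) / (1.0 + self.t)
```

With no observations, `p_hat` returns the uniform value `1 / n_levels`; as observations accumulate, it approaches the empirical frequency.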



2.2.3 Strategies 

A strategy for $i$ is a map assigning to each $p_i(t)$ an action to be taken, i.e.
some $a_i(t)$. A particular strategy can be seen as motivated by some "rationale",
resting in turn on the basis of a learning process which we now describe.

Each $i$ is risk-neutral and aims at maximizing the expected immediate payoff.
Because $i$ believes that the opponents play a particular $\bar{a}_{-i}$ with probability
$p_i^t(\bar{a}_{-i})$, its strategy dictates:

$$a_i(t+1) = \arg\max_{a_i} E\left[ f_i^t(a_i, \bar{a}_{-i}) \mid p_i^t \right] = \arg\max_{a_i} \sum_{\bar{a}_{-i}} f_i^t(a_i, \bar{a}_{-i})\, p_i^t(\bar{a}_{-i}) \qquad (3)$$
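The decision rule in Eq. (3) is a probability-weighted average of estimated payoffs followed by an argmax. The Python sketch below shows one way to compute it; the function name and the dictionary layout are hypothetical. How pairs that were never observed are treated is not specified in the text; here they are skipped and the remaining probabilities renormalized, which is an assumption of this sketch.

```python
def best_action(actions, f_hat, p_hat):
    """Pick the action with the highest estimated expected payoff.

    actions: iterable of candidate own actions a_i
    f_hat:   dict mapping (a_i, abar) -> average observed payoff
    p_hat:   dict mapping abar -> estimated probability of that average action
    Returns None if no candidate has any observed payoff.
    """
    best, best_value = None, float("-inf")
    for a in actions:
        # Keep only the (payoff, probability) pairs actually observed for a.
        seen = [(f_hat[(a, ab)], p) for ab, p in p_hat.items() if (a, ab) in f_hat]
        mass = sum(p for _, p in seen)
        if mass <= 0:
            continue  # nothing known about this action yet
        # Expected payoff conditional on the observed abar values (assumption).
        value = sum(f * p for f, p in seen) / mass
        if value > best_value:
            best, best_value = a, value
    return best
```

Ties are broken in favour of the action listed first, since a later action replaces the current best only when its value is strictly higher.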

The fact that banks maximize their immediate payoff is only one of the 
many possible preference specifications. Alternatively, agents might also be 

^6 We explored in particular the possibility that players believe that the opponents' actions
follow a Markov process. In this case, the estimate under b) is a transition matrix, containing
the probabilities of a particular $\bar{a}_{-i}$ being played at $t + 1$, conditional on $a = (a_i, \bar{a}_{-i})$ being
observed at $t$.

^7 See e.g. Fudenberg and Levine (1998), p. 31, for details.
Figure 1: Information, learning and strategies



taking into consideration future payoffs. In this case, however, optimal strate- 
gies would be far more complex. Indeed, discounting expected future payoffs 
would create an implicit trade-off between exploitation (the use of actions that 
appear optimal in the light of the available information), and exploration (the 
use of seemingly sub-optimal actions, which might appear such because of lack 
of experimentation). Our preference specification severs this payoff-related link 
between stage games, which nevertheless are interrelated because learning takes 
place across them. This short-sighted maximization assumption is common in 
the bounded-rationality and evolutionary game theory literature.^8

Figure 1 illustrates the relationship between histories, information and strategies.
A history up to $t-2$ is summarized in $p_i(t-1)$. This, along with the new
data obtained in $t-1$, is updated into $p_i(t)$ by the learning rule $\Lambda_i$. In turn,
$p_i(t)$, which is the information available at $t$, is mapped by a strategy $s_i$ to an
action $a_i(t)$.

It should be noted that in early repetitions of the stage game, $p_i^t$ is heavily
influenced by the initial (arbitrary) estimate, for which we simply use $1/K$,
while $f_i^t(\cdot)$ is the average of a few observed payoffs only. Hence, strict adoption
of Eq. (3) would most likely yield, and possibly lock into, sub-optimal actions.
To avoid this, we suppose that agents first randomly choose a certain number of
actions to explore the environment, and then start making choices as in Eq. (3)
- which we call "informed decisions". To ensure further exploration, each agent
also tries out for itself, at least once, every $\bar{a}_{-i}$ that it encounters. This models

^8 Fudenberg and Levine (1998) contains an authoritative review of models with myopic
agents. To quote only some of these seminal contributions, see i) the literature on Fictitious
Play by Brown (1951), Foster et al. (1998), and Krishna et al. (1998), ii) the literature on
learning and bounded rationality, including Kandori et al. (1993), Young (1993), Ellison (2003),
and Blume (1993), iii) studies on Imitation and Social Learning by e.g. Schlag (1994).
learning from other agents.

These choices on the length of the "exploration phase" are evidently arbi- 
trary; however, in the model there are clearly no exploration costs, so the length 
of the exploration period can indeed be assumed exogenously. On the other 
hand, some limit to exploration must be imposed, as the sheer size of the action 
spaces inhibits a brute force approach, whereby i collects a very large sample 
of all possible action profiles (and respective payoffs) before making informed 
decisions.^9

2.3 Specification of the payoff function 

The model of learning about an unknown stochastic payoff function that is 
determined partly by the agent's own actions and partly by the actions of other
agents can lend itself to a number of applications. The specification of the payoff 
function fi ties it to the problem of a payment system analyzed in this paper. 

One possible specification of $f$ could be a simple analytical function of the
players' actions. The problem would then become that of analyzing the limiting
behaviour of the learning rules and strategies, something that could be
done analytically, provided $f$ is simple enough. However, in quest of increased
realism in payment system modelling we specify $f$ via an algorithm representing
a "settlement day" with a large number of daily payments. To understand
why realistic analytical functions are difficult to develop, consider the follow- 
ing: imagine first that banks have always enough liquidity to make payments 
instantaneously. In this case payments flow undisturbed, delay costs are zero, 
and only liquidity costs matter. Their calculation is trivial. However, if banks 
commit fewer funds for settlement (as banks want to minimize costs), it becomes
more likely that the funds are at some point insufficient for banks to execute
payments immediately. As shown in Beyeler et al. (2006), these liquidity shortages
cause payments to occur in "cascades", whose length and frequency bear
no correlation with the arrival process that generates payment instructions,
as the settlement of payments becomes coupled across the network
when incoming funds allow a bank to release previously queued payments. As
a consequence, the flow of liquidity and thus delay times for individual banks 
become largely unpredictable. 

In the model, $a_i$ represents any external funding decision by the bank. The
funds allow the bank to execute payment instructions, which the bank receives
throughout the day according to a random process. Banks incur costs both from
committing liquidity for settlement and from experiencing delays in payment
processing due to insufficient funds. The settlement day is modelled in continuous
time, with time indexed as $t \in [0, T]$.^10 At any time interval $dt$, bank

^9 In the simulations, each $i$ chooses among 40 possible actions, and 40 are the possible
average actions by the "others" ($\bar{a}_{-i}$). Thus, full exploration would require observing
$40^2 = 1600$ different action profiles $(a_i, \bar{a}_{-i})$, each of which should be sampled enough times to obtain
a reliable estimate of $f(\cdot)$.

^10 In the previous section, $t$ indexes "days", but we feel there is no risk of confusion, as
Section 2 and the present are relatively independent.
$i$ receives an instruction to pay 1 unit to any other bank $j$ with probability
$\frac{1}{N(N-1)}\,dt$. Because there are $N$ such banks $i$, and $N-1$ "other" banks $j$, the
arrival of payment instructions in the whole system is a Poisson process with
parameter 1 so that, on the average day, $T$ payment instructions are generated.
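The arrival process described above can be sketched in a few lines of Python. This is an illustration, not the authors' implementation: the function name is hypothetical, and the sketch uses the standard fact that superposing independent Poisson streams, one per ordered pair of banks, gives a single Poisson stream with the summed rate (here 1) in which each arrival belongs to a uniformly chosen pair.

```python
import random


def generate_instructions(n_banks, day_length, rng):
    """Return a time-ordered list of (time, sender, receiver) unit payments."""
    instructions = []
    t = rng.expovariate(1.0)  # system-wide arrival rate is 1 per time unit
    while t < day_length:
        sender = rng.randrange(n_banks)
        receiver = rng.randrange(n_banks - 1)
        if receiver >= sender:
            receiver += 1  # shift so that the sender never pays itself
        instructions.append((t, sender, receiver))
        t += rng.expovariate(1.0)  # exponential gap to the next instruction
    return instructions


day = generate_instructions(n_banks=15, day_length=1000.0, rng=random.Random(0))
```

The number of instructions generated varies from day to day around `day_length`, which is what makes each bank's payoff stochastic even for a fixed action profile.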

Payments are executed using available liquidity; $i$'s available liquidity at
time $t$ is defined as:

$$l_i(t) = a_i + \int_{s=0}^{t} \left( y_i(s) - x_i(s) \right) ds$$

where $x_i(s)$ (resp. $y_i(s)$) is the amount of $i$'s sent (resp. received) payments
at time $s$. For simplicity, we assume that every $i$ adopts the following payment
rule:^11

at each $t$, execute instructions using First-in-First-out (FIFO) as long as $l_i(t) > 0$;
else, queue received instructions (4)

We assume that a payment instruction received by bank $i$ at $t$ and executed
at $t'$ carries a cost equal to

$$C_D = \kappa\, \frac{t' - t}{T}, \qquad \kappa > 0 \qquad (5)$$

where $\kappa$ is the "daily interest cost" of delaying payments. Similarly, liquidity
costs (e.g. opportunity cost of collateral) are linear:

$$C_L = \lambda a_i, \qquad \lambda > 0 \qquad (6)$$

Finally, the stage game payoff is the negative of the sum of the costs in Eq. (5)
and those in Eq. (6), the former summed up over all of $i$'s delayed payments.
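To make the settlement mechanics concrete, the Python sketch below settles one day of unit payments under the FIFO rule and returns each bank's total cost, that is, liquidity cost plus delay cost scaled by the day length. It is a simplified illustration with hypothetical names, not the authors' algorithm. Two assumptions are made here: payments still queued at the end of the day are charged as if settled at the closing time, and a payment released by incoming funds is settled at the same instant the funds arrive.

```python
from collections import deque


def settle_day(instructions, liquidity, day_length, kappa, lam):
    """Settle time-ordered (time, sender, receiver) unit payments.

    liquidity: list of integer funds a_i committed by each bank.
    Returns a list with each bank's total cost for the day.
    """
    n = len(liquidity)
    balance = list(liquidity)
    queues = [deque() for _ in range(n)]
    delay = [0.0] * n  # accumulated delay of each bank's outgoing payments

    def release(bank, now):
        # Pay queued instructions FIFO while funds last; each receiver may
        # then be able to release its own queue, producing a cascade.
        pending = [bank]
        while pending:
            b = pending.pop()
            while queues[b] and balance[b] > 0:
                arrived, receiver = queues[b].popleft()
                balance[b] -= 1
                balance[receiver] += 1
                delay[b] += now - arrived
                pending.append(receiver)

    for t, sender, receiver in instructions:
        queues[sender].append((t, receiver))
        release(sender, t)

    # Assumption: unsettled payments are treated as settled at the close.
    for b in range(n):
        for arrived, _ in queues[b]:
            delay[b] += day_length - arrived

    return [lam * liquidity[b] + kappa * delay[b] / day_length for b in range(n)]
```

In a two-bank example where bank 1 has no funds and receives an instruction at time 1, the payment waits until bank 0 pays bank 1 at time 3, so bank 1 bears two time units of delay while bank 0 bears only its liquidity cost.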



3 Experiments 

3.1 Parameters and equilibria 

The continuous-time settlement day is modelled as a sequence of $10^4$ time units
indexed $t \in [0, 10^4]$. Given that the arrival of payment instructions is a Poisson
process with parameter 1, on average banks receive a total of $10^4$ payment
instructions per day. A sequence of days (stage games) is called a play. In the
simulations, we terminate a play when no bank changes its liquidity commitment 
decision for 10 consecutive days (convergence). We run 30 plays for each set of 
model parameters and find that convergence always occurs. 
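The stopping rule can be written as a short check on the history of action profiles. The Python sketch below uses a hypothetical function name and reads "no bank changes its decision for 10 consecutive days" as requiring that each of the last 10 profiles equals the one before it, so 11 identical profiles in a row; the text does not settle this detail, so the reading is an assumption.

```python
def has_converged(history, window=10):
    """history: list of action profiles (tuples), one per day, latest last."""
    if len(history) < window + 1:
        return False
    recent = history[-(window + 1):]
    # Converged if every profile in the window equals the one preceding it.
    return all(profile == recent[0] for profile in recent)
```

A play would then be simulated day by day, appending each day's profile to `history` and stopping as soon as `has_converged(history)` returns `True`.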

Banks start the adaptation process with random decisions for liquidity, and 
gradually accumulate information on the shape of the payoff function. When 

^11 The rule under (4) is evidently optimal for the cost specification given here. As banks
need to pay upfront for liquidity, they have no incentive to delay payments if liquidity is 
available. Under other cost specifications (e.g. priced credit or heterogeneous payment delay 
costs) this would, however, not be the case. 
enough information has been collected, banks adopt the rule described in Eq.
(3) for making decisions on liquidity to commit. A series of stage games is ended
in the simulations when no bank changes its collateral posting decision for 10
consecutive games. This means that at this point, no bank wishes to change its
action, given the information available and given other banks' actions.^12

Suppose that the payoff function were known by the banks; given our spec- 
ification of strategies, it would then be clear that the converged-to actions are 
a Nash equilibrium of the stage game.^13 We cannot quite draw the same conclusion
in our setting: the payoff exploration is necessarily partial, so it might
be that some profitable actions were never tested enough to be recognized as
such. Hence, the equilibria converged to are only "partial" Nash equilibria, or
Nash equilibria conditional on the "partial" information that banks have about
the payoff function. However, as we discuss later, we observe a clear consistency
in learning. This suggests that the partial information collected is sufficient,
and the equilibria reached are probably good approximations of the true
stage-game Nash equilibria.

The base system consists of 15 banks. In Section 3.3 we investigate the
impact of the system size. Banks choose their action among forty different
levels, ranging up to 80 in intervals of 2. The cost functions are as in Eq. (5)
and Eq. (6). We normalize liquidity costs at $\lambda = 1$ and look at different values
of delay costs $\kappa$ ranging from 1/8 to 512 in multiples of 4. We are interested in
the demand for liquidity (i.e. in the choices of $a_i$) at different values of $\kappa$, and
in the resulting settlement delays and payoffs.
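For reference, "from 1/8 to 512 in multiples of 4" corresponds to seven delay-cost values. The short Python snippet below enumerates them; the variable names are illustrative only.

```python
LIQUIDITY_COST = 1.0  # lambda, normalized to 1 in the text

# Delay costs kappa: start at 1/8 and multiply by 4 until 512 is reached.
delay_costs = [0.125 * 4 ** k for k in range(7)]

# Ratio of delay cost to liquidity cost for each experiment.
cost_ratios = [kappa / LIQUIDITY_COST for kappa in delay_costs]
```

With liquidity costs normalized to 1, the delay cost and the delay-to-liquidity cost ratio coincide.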

3.2 Base experiment 

As expected, with low delay costs banks tend to commit low amounts of liquidity
(~50 units) and delay payments instead, and at high cost of delay banks prefer
to commit plenty of liquidity (~1044 units) in order to avoid expensive delays.
Figures 2 and 3 illustrate the typical evolution of the simulations for two extreme
levels of delay costs. The sudden changes in liquidity correspond to the point
where banks start making informed decisions (see Section 2.2.3).

We find that convergence always occurs - on an aggregate level within a
narrow range. A priori, learning might be sensitive to initial observations, and
hence it might be subject to drastic differences in the final "conclusions". The
consistency of the learning process is illustrated in Figure 4, where we plot the
converged-to value of $\sum_i a_i$ (i.e. the total liquidity committed) across plays, for
different parameter specifications. Due to randomness - which makes histories
necessarily different - the "learned" liquidity levels clearly differ, but they
do so within small ranges.

^12 While some changes in actions may occur due to the randomness of payoffs and learning,
these did not qualitatively change the results in simulations with longer convergence criteria.

^13 This simple property of Fictitious Play stems from the fact that, if for all $t > t'$ the action
profile is some constant $a$, then the estimates $p_i$ converge to the true value $\delta_{a_{-i}}$. Strategies
prescribe playing a best reply to $p_i$ so, if $a$ were not a Nash Equilibrium, sooner or later one
player would choose some $a_i' \neq a_i$.



Figure 2: Total liquidity - low delay costs (horizontal axis: stage game)

Figure 3: Total liquidity - high delay costs (horizontal axis: stage game)

Figure 4: Total equilibrium liquidity across plays (horizontal axis: play; series by cost ratio)



It should be noted that while the system consistently "learns" the same level
of total liquidity, this can represent many configurations of single banks' liquidity
choices. Hence, our simulations do not show that the same equilibrium
is always reached; rather, that the equilibria that are reached are characterized by
a narrow span of total liquidity in the system. Given the symmetry of the
model, it is clear that for any equilibrium (i.e. any equilibrium profile of actions
$(a_1, a_2, \ldots, a_N)$), there are many other equilibria obtained via a permutation of
the actions between the players that yield the same total liquidity on the system
level.

Another interesting feature of the model is its ability to match a well known
empirical fact: a low ratio of available liquidity to daily payments ("netting
ratio"), which in turn implies high levels of liquidity recycling. Because the
system processes on average 10,000 payments a day, the above results imply
that the ratio in our simulation is between 0.5% and 10.4%. For comparison,
CHAPS Sterling's netting ratio is 15% (James 2004)^14 and in Fedwire it is as low
as 2.2%.^15 The real netting ratios are bound to be higher due to the fact that
payments in them are of varying sizes in contrast to the more fluid unit size
payments modelled here.
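The two bounds quoted above follow directly from the liquidity levels reported for the base experiment (about 50 and about 1044 units) and the average of 10,000 unit payments per day. A minimal Python check, with a hypothetical helper name:

```python
def netting_ratio(liquidity, payments_per_day):
    """Ratio of committed liquidity to the value of payments settled."""
    return liquidity / payments_per_day


low = netting_ratio(50, 10_000)      # low delay costs
high = netting_ratio(1044, 10_000)   # high delay costs
print(f"{low:.1%} to {high:.1%}")
```

Because every payment in the model has unit size, the number of payments and the value of payments coincide, which is why the ratio can be computed from counts.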

Figure 5 shows the equilibrium demand for intraday credit as a function of 
the cost ratio. This function is S-shaped in the exponential delay cost scale, that 

^14 Calculated as the ratio of collateral used for intraday credit to the value of payments
settled.

^15 In 2001. Calculated as (balances + mean overdrafts) / total value. Sources:
www.federalreserve.gov/paymentsystems/fedwire
Figure 5: Liquidity as a function of delay costs (horizontal axis: delay cost)



is, it is relatively flat at both low and high levels of delay costs. At comparatively 
low delay costs banks evidently commit little liquidity; hence the returns to
increasing liquidity holdings are high, and so a little more liquidity suffices to 
cope with increased delay costs. As a consequence, for low delay cost levels the 
demand curve is flat. Consider now the situation with high delay costs. There, 
the liquidity committed is high and returns to increasing liquidity are low, so 
one might think that an increase in delay costs calls for high extra amounts 
of liquidity. However, this is not the case, because gains from liquidity indeed 
diminish above a certain level when all payments can be made promptly. Hence, 
for high delay costs, liquidity demand is insensitive to further increases in delay 
costs, and the demand curve is flat again. In between these two extremes, the 
demand for liquidity increases exponentially with delay costs. 

We find that delays in the system increase exponentially as banks reduce 
the amount of liquidity when this is relatively expensive compared to delaying 
payments. The phenomenon is known as "deadweight losses" (Angelini 1998) or 
"gridlocks" (Bech and Soramaki 2002) in payment systems. Figure 6 shows the 
relationship between system liquidity and payment delays. In intuitive terms, 
the reason for this exponential pattern is the following. First, a bank that reduces
its liquidity holdings might have to delay its outgoing payments. Second, as a
consequence, the receivers of the delayed payments may in turn need to delay
their own payments, causing further downstream delays and so on. Hence, a
decrease of a unit of liquidity may cause multiple units to be delayed. Third,
such a chain of delays - and hence this multiplicative effect - is more likely and 
Figure 6: Delays as a function of liquidity



longer, the lower the liquidity possessed by the banks. Thus the total effect of 
liquidity reduction acts in a compounded (exponential) fashion. 
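The chain-of-delays mechanism can be illustrated with a toy simulation. The sketch below is not the model used in this paper: it assumes unit-sized payments, one payment order per time unit, uniformly random senders and receivers, a fixed (not learned) level of initial liquidity, and a first-in-first-out queue. The function name `total_delay` and all its parameters are hypothetical.

```python
import random
from collections import deque

def total_delay(n_banks, liquidity, n_orders, seed=0):
    """Toy settlement day: unit payments, one order per tick, queued if unfunded."""
    rng = random.Random(seed)
    balance = [liquidity] * n_banks
    queues = [deque() for _ in range(n_banks)]  # entries: (arrival_tick, receiver)
    delay = 0
    for tick in range(n_orders):
        sender = rng.randrange(n_banks)
        receiver = rng.randrange(n_banks - 1)
        if receiver >= sender:
            receiver += 1  # uniform draw over the other banks
        queues[sender].append((tick, receiver))
        # Settle what can be settled; each settlement may release
        # queued payments at the receiving bank (the downstream chain).
        active = [sender]
        while active:
            bank = active.pop()
            while queues[bank] and balance[bank] >= 1:
                arrived, dest = queues[bank].popleft()
                balance[bank] -= 1
                balance[dest] += 1
                delay += tick - arrived
                active.append(dest)
    # Payments still queued at the close count as delayed until end of day.
    for q in queues:
        delay += sum(n_orders - arrived for arrived, _ in q)
    return delay
```

Calling, for instance, `[total_delay(15, a, 10_000) for a in (0, 1, 2, 5, 10, 50)]` gives a rough picture of how total delay responds to the initial liquidity `a` posted by each bank in this stylized setting.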

An interesting question is how good the performance of the banks is in
absolute terms. To understand this we compare the payoffs received by the
banks through adaptation with two extreme strategies:

a) all banks delay all payments to the end of the day;

b) all banks commit enough liquidity to be able to process all payments
promptly.*

* In fact the liquidity committed in the simulation with the highest delay cost was used as
the scenario for prompt payment processing.

The comparison between the performance of these two pure strategies and
the learned strategy is shown in Figure 7. For any cost ratio, the adaptive
banks obtain better payoffs than either of the two extreme strategies - except for
the case with high liquidity costs, when the costs are equal. Banks manage to
learn a convenient trade-off between delay and liquidity costs. By contrast,
the strategy under a) quickly becomes very expensive as delay costs increase,
and the strategy under b) is exceedingly expensive when delays are not costly.

Figure 7: Comparison of strategies: a) min liquidity; b) min delays

3.3 Impact of network size

In order to investigate the impact of system size on the results presented in the
previous section, we ran simulations varying the number of participants. To
ensure comparability, we kept the number of payments constant across simulations
and kept the network complete.

We observe that liquidity demand increases with the system size, and the
increase is more pronounced the higher the delay costs. For example, the demand
function is unchanged at low delay costs while, for high delay costs, a 50-bank
system requires 215% of the liquidity needed in a 15-bank system.

Similar results hold for delays. The relationship between liquidity committed
and delays remains close to exponential irrespective of system size; however,
larger systems experience more delays for any level of initial liquidity (see
Figure 9).

An intuitive explanation of these phenomena could be the following. First,
note that if the number of participants is increased by a factor $x$ (keeping
turnover constant), the volatility of the balance of each bank is multiplied by
a factor $1/x' > 1/x$ - we show this in a moment. Second, suppose that i) the
optimal $a_i$ is proportional to the volatility $\delta$ of a bank's balance
(i.e. $a_i = z\delta$)* and that ii) all banks post the same amount of liquidity
(i.e. $a_i = a_j$). It then simply follows that the total amount of liquidity
increases with the system's size: $(nx)\, z\delta/x' > nz\delta$ (here $nx$ is the
number of banks in the larger system, and $\delta/x'$ the corresponding volatility
of balances).

* This is exactly the case if a bank chooses $a_i$ so as to cover $z$ "standard deviations" from the
average balance.

Figure 8: The effect of system size on the demand for liquidity

The key point is that, if the number of participants is increased (keeping
turnover constant), the volatility of each bank's balance falls less than
proportionally, so the sum of volatilities across banks rises. To see why this is
the case, consider the simplified but illustrative situation where liquidity is
abundant, so there are no delays. In this case a bank's net
position is the sum of a series of random perturbations (incoming and outgoing
payments), equally likely to affect it positively and negatively. In other words,
a bank's net position is a random walk, whose value after $n$ perturbations
averages zero, with a standard deviation $\sqrt{n}$. By increasing the system size by
a factor $x$, the orders are distributed over more banks, so the average number
of perturbations for any given time interval is multiplied by $1/x < 1$.
Accordingly, the standard deviation of the balances at the end of any time interval is
multiplied by $1/x' = \sqrt{1/x} > 1/x$.
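The scaling argument can be checked numerically. The following is a minimal sketch under the stylized assumptions of the argument itself (abundant liquidity, unit perturbations of equal probability in either direction); the function names `balance_std` and `total_liquidity_ratio` are illustrative and do not come from the simulation code of the paper.

```python
import math
import random
import statistics

def balance_std(n_steps, n_walks=2000, seed=1):
    """Sample standard deviation of a +/-1 random walk after n_steps perturbations."""
    rng = random.Random(seed)
    finals = [sum(rng.choice((-1, 1)) for _ in range(n_steps))
              for _ in range(n_walks)]
    return statistics.pstdev(finals)

def total_liquidity_ratio(x):
    """Ratio (n x) z (delta / sqrt(x)) / (n z delta) = sqrt(x)."""
    return x / math.sqrt(x)

# Quartering the number of perturbations per bank (x = 4) should roughly
# halve the volatility of each balance rather than quarter it.
ratio = balance_std(100) / balance_std(400)
```

Under these assumptions, `total_liquidity_ratio(50 / 15)` is about 1.83, so the stylized argument predicts that a 50-bank system needs roughly 183% of the liquidity of a 15-bank system. This is somewhat below the 215% observed in the simulations at high delay costs, which is unsurprising given that the argument ignores delays.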

3.4 Throughput guidelines 

Some interbank payment systems have guidelines on payment submission jointly
agreed upon by the system participants.* The rationale for throughput guidelines
is to induce early settlement in order to e.g. reduce operational risk or
perceived coordination failures among participants. For example, if a large share
of payments is settled late in the day, an operational incident would be more
severe, as more payments could potentially remain unsettled before the close of the
payment system and of the financial markets where banks balance their end-of-day
liquidity positions.

* E.g. the FBE (Banking Federation of the European Union 1998) has set guidelines on
the timing of certain TARGET payments. In CHAPS Sterling, members must ensure that on
average (over a calendar month) 50% of their daily value of payments is made by 12 pm and
that 75%, by value, is made by 2.30 pm (James 2004).
We simulate a particular realization of throughput guidelines by introducing
an additional penalty charge for delays that last longer than one tenth of the 
settlement day (i.e. 1000 time units). The penalty charge is set to 64, in order 
to sufficiently penalize non-compliance with the rules. 
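One way to read this penalty scheme is sketched below. The text fixes only the threshold (1000 time units) and the size of the charge (64); the linear per-time-unit delay cost and the application of the penalty as a one-off lump sum per late payment are assumptions made for illustration, and `delay_charge` is a hypothetical name.

```python
def delay_charge(delay, unit_delay_cost, threshold=1000, penalty=64.0):
    """Cost of delaying one payment by `delay` time units.

    Assumption: ordinary delay cost is linear in the delay, and the
    throughput penalty is a lump sum charged once when the delay
    exceeds the threshold.
    """
    base = unit_delay_cost * delay
    return base + (penalty if delay > threshold else 0.0)
```

With a unit delay cost of 0.5, for example, a delay of exactly 1000 time units costs 500.0, while a delay of 1001 time units costs 564.5.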

Figure 10 shows the impact of the throughput guidelines on the amount of 
liquidity committed by the banks. When delay costs are high, banks already 
commit enough liquidity to avoid long delays, so the throughput guidelines 
have no effect. They do, however, induce banks to commit more liquidity when
delay costs are low. Not surprisingly, this comes at a cost to the banks,
which are forced away from their first-best choice. The increase in costs is
of the order of 70% at the lowest level of delay costs, and 20% at the second
lowest. The payoff comparison is shown in Figure 11.

3.5 Operational incident 

Short-term outages of banks are rare in actual payment
systems, but do take place occasionally. In a typical scenario a participating
bank experiences problems connecting to the system due to the temporary
unavailability of IT systems or telecommunication facilities. Due to the design of the
payment systems, a disconnected bank can in such situations generally still
receive payments to its account at the central bank, but cannot submit
instructions to pay from its account. Unless other banks stop paying to the troubled
bank, it quickly becomes a liquidity sink, and the liquidity available for settling
payments at other banks is reduced.*

In this set of simulations we ask how much liquidity banks
would wish to commit in such a situation, i.e. what is the impact of an
operational outage on the demand for intraday credit. The banks are assumed to be
unaware of the possible incident, and unable to discriminate among their
counterparts, so the intraday liquidity management rule under Eq. (4) is still adopted.
Under these assumptions, we simulated a scenario where a randomly selected
bank can receive, but cannot send payments for the first half of the settlement
day. On average, this means that up to $T/(2N) = 10,000/(2 \cdot 15) \approx 333$ liquidity
units cannot be used by other banks as a source of liquidity.** Depending on
the delay cost, this figure varies between 1900% and 30% of the average total
liquidity injected in the system at the beginning of the day (the first figure being
for the case when both delay cost and liquidity demand are low, and the second
for when delay cost and liquidity demand are both high).
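The arithmetic behind these figures can be reproduced in a few lines. This is only a back-of-the-envelope check: the function names are illustrative, and the implied totals are derived here from the stated percentages rather than taken from the simulation output.

```python
def trapped_liquidity(total_payments, n_banks):
    """Upper bound on liquidity absorbed by a bank that cannot send for half a day."""
    return total_payments / (2 * n_banks)

def implied_total_liquidity(trapped, share):
    """Total initial liquidity implied if `trapped` equals `share` of it."""
    return trapped / share

trapped = trapped_liquidity(10_000, 15)                    # about 333 units
low_demand = implied_total_liquidity(trapped, 19.0)        # trapped = 1900% of total
high_demand = implied_total_liquidity(trapped, 0.30)       # trapped = 30% of total
```

Read this way, the two percentages correspond to an average total initial liquidity of roughly 17.5 units when delay costs are low and roughly 1,111 units when they are high.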

* See e.g. the analysis of the impact of the 9/11 terrorist attacks in McAndrews and Potter
(2002), Lacker (2004), and Soramäki et al. (2007).

** The shortage of liquidity equals the number of payment orders received by the distressed
bank and not executed until the second half of the day.

We found that the effect of operational incidents on the demand for liquidity
is highest at a relative delay/liquidity cost ratio of 2 - hence fairly low in the
range; at this point, the increase in intraday credit demand is 144 units, or
an increase of 85%. It should be noted that banks do not compensate for
the full amount of liquidity "trapped" by the distressed bank, but prefer to
partly make up for that, and partly increase delays. For higher delay costs,
delays remain approximately unchanged compared to the scenario without the
incident. Finally, when delay costs are lower than liquidity costs (i.e. for a cost
ratio <1), banks prefer to hold about the same amount of liquidity as without
the incident, and experience the delays caused by the reduced liquidity. In
sum, an operational incident increases both the demand for
liquidity (but by less than what was trapped) and delays - more so whichever
is less costly.

4 Summary and conclusions 

In this paper we developed an agent-based, adaptive model of banks in a
payment system. Our main focus is on the demand for intraday credit under
alternative scenarios: i) a "benchmark" scenario, where payment flows are determined
by the initial liquidity and by an exogenous arrival of payment instructions; ii)
a system where, in addition, throughput guidelines are exogenously imposed; and iii)
a system subject to operational incidents.

It is well known that the demand for intraday credit is generated by a
trade-off between the costs associated with delaying payments, and liquidity costs.
Simulating the model for different parameter values, we were able to draw with
some precision a liquidity demand function, which turns out to be an S-shaped
function of the delay/liquidity cost ratio. We also looked at the costs
experienced by the banks, as a function of the model's parameters. By the process
of individual payoff maximization, banks adjust their demand for liquidity up
(reducing delays) when delay costs increase, and down (increasing delays) when
they fall. Interestingly, however, the absolute delay cost remains approximately
constant when the delay/liquidity cost ratio changes. As expected, the
demand for intraday credit is increased by an operational incident. However, this
effect is found to be important only if liquidity is costly compared to delaying
payments. Likewise, throughput guidelines increase the demand for intraday
credit - as banks try to avoid penalties for not adhering to them. In total this
reduces the payoffs of the banks. Nevertheless, throughput guidelines may be
beneficial when additional benefits that are not in the current model are taken
into account (among these, benefits related to reducing operational risk).

Figure 12: Delays as a function of cost ratio (b: incident)

This model produces realistic behaviour, suggesting that it may be used to
investigate a wide array of issues in future applications. A number of extensions
are possible. First, alternative specifications for the instruction arrival process
may be applied (see e.g. Beyeler et al. (2006)). Alternatively, one could change
the assumptions on the banks' network: while the complete network assumption
implicitly adopted here fits well with e.g. the UK CHAPS system, an interesting
question is how other topologies, such as the scale-free network topology found in
Fedwire (Soramäki et al. 2007), would affect the results. Also, different
individual preferences could be investigated. We assumed that banks are risk neutral
and interested in maximising their immediate payoffs; it would be interesting
to verify whether the introduction of risk aversion and/or preferences over the
expected stream of payoffs would change the results. Finally, more complex behaviour can
be easily studied within our model; for example, the "pay-as-much-as-you-can"
rule for queuing payments could be replaced by sender limits. Similarly, more
sophisticated strategies can be easily modelled, supposing e.g. that banks keep
their actions constant for a number of periods (to gather more data and explore
the environment), instead of exploiting after a fixed amount of time what seems
to be the best action.